AITopics | calibration curve

Collaborating Authors

calibration curve

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

34547650b2ca69d91f3b3c3ae8b21962-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 23:26:05 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Lebanon > Keserwan-Jbeil Governorate > Blat (0.05)
Europe > France (0.04)
(5 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Beyond Accuracy: An Empirical Study of Uncertainty Estimation in Imputation

Hossain, Zarin Tahia, Milani, Mostafa

arXiv.org Artificial IntelligenceNov-27-2025

Handling missing data is a central challenge in data-driven analysis. Modern imputation methods not only aim for accurate reconstruction but also differ in how they represent and quantify uncertainty. Yet, the reliability and calibration of these uncertainty estimates remain poorly understood. This paper presents a systematic empirical study of uncertainty in imputation, comparing representative methods from three major families: statistical (MICE, SoftImpute), distribution alignment (OT-Impute), and deep generative (GAIN, MIWAE, TabCSDI). Experiments span multiple datasets, missingness mechanisms (MCAR, MAR, MNAR), and missingness rates. Uncertainty is estimated through three complementary routes: multi-run variability, conditional sampling, and predictive-distribution modeling, and evaluated using calibration curves and the Expected Calibration Error (ECE). Results show that accuracy and calibration are often misaligned: models with high reconstruction accuracy do not necessarily yield reliable uncertainty. We analyze method-specific trade-offs among accuracy, calibration, and runtime, identify stable configurations, and offer guidelines for selecting uncertainty-aware imputers in data cleaning and downstream machine learning pipelines.

data quality, imputation, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2511.21607

Country: North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization

Desobry, Naomi, Zhalieva, Elnura, Taieb, Souhaib Ben

arXiv.org Machine LearningOct-28-2025

Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably more challenging. The existing literature on multivariate calibration primarily focuses on diagnostic tools based on pre-rank functions, which are projections that reduce multivariate prediction-observation pairs to univariate summaries to detect specific types of miscalibration. In this work, we go beyond diagnostics and introduce a general regularization framework to enforce multivariate calibration during training for arbitrary pre-rank functions. This framework encompasses existing approaches such as highest density region calibration and copula calibration. Our method enforces calibration by penalizing deviations of the projected probability integral transforms (PITs) from the uniform distribution, and can be added as a regularization term to the loss function of any probabilistic predictor. Specifically, we propose a regularization loss that jointly enforces both marginal and multivariate pre-rank calibration. We also introduce a new PCA-based pre-rank that captures calibration along directions of maximal variance in the predictive distribution, while also enabling dimensionality reduction. Across 18 real-world multi-output regression datasets, we show that unregularized models are consistently miscalibrated, and that our methods significantly improve calibration across all pre-rank functions without sacrificing predictive accuracy.

artificial intelligence, calibration, machine learning, (16 more...)

arXiv.org Machine Learning

2510.21273

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

34547650b2ca69d91f3b3c3ae8b21962-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:55:04 GMT

dataset, kermut, kernel, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > France (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine Learning-based Early Detection of Potato Sprouting Using Electrophysiological Signals

Andreoletti, Davide, Marcolongo, Aris, Djukic, Natasa Sarafijanovic, Roulet, Julien, Billeter, Stefano, Kurenda, Andrzej, Visse-Mansiaux, Margot, Dupuis, Brice, Plummer, Carrol Annette, Paoli, Beatrice, Ayoub, Omran

arXiv.org Artificial IntelligenceJul-2-2025

Accurately predicting potato sprouting before the emergence of any visual signs is critical for effective storage management, as sprouting degrades both the commercial and nutritional value of tubers. Effective forecasting allows for the precise application of anti-sprouting chemicals (ASCs), minimizing waste and reducing costs. This need has become even more pressing following the ban on Isopropyl N-(3-chlorophenyl) carbamate (CIPC) or Chlorpropham due to health and environmental concerns, which has led to the adoption of significantly more expensive alternative ASCs. Existing approaches primarily rely on visual identification, which only detects sprouting after morphological changes have occurred, limiting their effectiveness for proactive management. A reliable early prediction method is therefore essential to enable timely intervention and improve the efficiency of post-harvest storage strategies, where early refers to detecting sprouting before any visible signs appear. In this work, we address the problem of early prediction of potato sprouting. To this end, we propose a novel machine learning (ML)-based approach that enables early prediction of potato sprouting using electrophysiological signals recorded from tubers using proprietary sensors. Our approach preprocesses the recorded signals, extracts relevant features from the wavelet domain, and trains supervised ML models for early sprouting detection. Additionally, we incorporate uncertainty quantification techniques to enhance predictions. Experimental results demonstrate promising performance in the early detection of potato sprouting by accurately predicting the exact day of sprouting for a subset of potatoes and while showing acceptable average error across all potatoes. Despite promising results, further refinements are necessary to minimize prediction errors, particularly in reducing the maximum observed deviations.

artificial intelligence, machine learning, prediction, (15 more...)

arXiv.org Artificial Intelligence

2507.00862

Genre: Research Report > New Finding (0.66)

Industry:

Food & Agriculture > Agriculture (0.68)
Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

JAPAN: Joint Adaptive Prediction Areas with Normalising-Flows

English, Eshant, Lippert, Christoph

arXiv.org Machine LearningMay-30-2025

Conformal prediction provides a model-agnostic framework for uncertainty quantification with finite-sample validity guarantees, making it an attractive tool for constructing reliable prediction sets. However, existing approaches commonly rely on residual-based conformity scores, which impose geometric constraints and struggle when the underlying distribution is multimodal. In particular, they tend to produce overly conservative prediction areas centred around the mean, often failing to capture the true shape of complex predictive distributions. In this work, we introduce JAPAN (Joint Adaptive Prediction Areas with Normalising-Flows), a conformal prediction framework that uses density-based conformity scores. By leveraging flow-based models, JAPAN estimates the (predictive) density and constructs prediction areas by thresholding on the estimated density scores, enabling compact, potentially disjoint, and context-adaptive regions that retain finite-sample coverage guarantees. We theoretically motivate the efficiency of JAPAN and empirically validate it across multivariate regression and forecasting tasks, demonstrating good calibration and tighter prediction areas compared to existing baselines. We also provide several \emph{extensions} adding flexibility to our proposed framework.

artificial intelligence, joint adaptive prediction area, machine learning, (15 more...)

arXiv.org Machine Learning

2505.23196

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Uncertainty quantification with approximate variational learning for wearable photoplethysmography prediction tasks

Bench, Ciaran, Desai, Vivek, Moulaeifard, Mohammad, Strodthoff, Nils, Aston, Philip, Thompson, Andrew

arXiv.org Artificial IntelligenceMay-19-2025

Photoplethysmography (PPG) signals encode information about relative changes in blood volume that can be used to assess various aspects of cardiac health non-invasively, e.g.\ to detect atrial fibrillation (AF) or predict blood pressure (BP). Deep networks are well-equipped to handle the large quantities of data acquired from wearable measurement devices. However, they lack interpretability and are prone to overfitting, leaving considerable risk for poor performance on unseen data and misdiagnosis. Here, we describe the use of two scalable uncertainty quantification techniques: Monte Carlo Dropout and the recently proposed Improved Variational Online Newton. These techniques are used to assess the trustworthiness of models trained to perform AF classification and BP regression from raw PPG time series. We find that the choice of hyperparameters has a considerable effect on the predictive performance of the models and on the quality and composition of predicted uncertainties. E.g. the stochasticity of the model parameter sampling determines the proportion of the total uncertainty that is aleatoric, and has varying effects on predictive performance and calibration quality dependent on the chosen uncertainty quantification technique and the chosen expression of uncertainty. We find significant discrepancy in the quality of uncertainties over the predicted classes, emphasising the need for a thorough evaluation protocol that assesses local and adaptive calibration. This work suggests that the choice of hyperparameters must be carefully tuned to balance predictive performance and calibration quality, and that the optimal parameterisation may vary depending on the chosen expression of uncertainty.

artificial intelligence, epistemic uncertainty, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2505.11412

Country:

Europe > Germany > Lower Saxony > Oldenburg (0.04)
North America > United States > Colorado (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Kuwait > Ahmadi Governorate > Al Ahmadi (0.04)

Genre:

Research Report (1.00)
Overview (0.67)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

A comprehensive review of classifier probability calibration metrics

Lane, Richard Oliver

arXiv.org Machine LearningApr-25-2025

Probabilities or confidence values produced by artificial intelligence (AI) and machine learning (ML) models often do not reflect their true accuracy, with some models being under or over confident in their predictions. For example, if a model is 80% sure of an outcome, is it correct 80% of the time? Probability calibration metrics measure the discrepancy between confidence and accuracy, providing an independent assessment of model calibration performance that complements traditional accuracy metrics. Understanding calibration is important when the outputs of multiple systems are combined, for assurance in safety or business-critical contexts, and for building user trust in models. This paper provides a comprehensive review of probability calibration metrics for classifier and object detection models, organising them according to a number of different categorisations to highlight their relationships. We identify 82 major metrics, which can be grouped into four classifier families (point-based, bin-based, kernel or curve-based, and cumulative) and an object detection family. For each metric, we provide equations where available, facilitating implementation and comparison by future researchers.

calibration, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2504.18278

Country: